{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "os.chdir('../.././UPCnlp')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tudouNLP.models.predict as p\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 情感分析\n",
    "第一句话为模棱两可的语句的情感分析，第二个例句为常规语句，可以看到，模型精确的捕捉了句子的语义信息\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[('Neutral:', 0.84711903), ('Positive:', 0.9965366)]\n"
     ]
    }
   ],
   "source": [
    "texts=\"今天心情很不好，但还好最后的结果是好的。 今天天气有点好啊\".split(' ')\n",
    "predictor=p.sentence('./models/bert_model')\n",
    "result=predictor.sentiment(texts)\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 分词"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[['在', '山东青岛', '，', '位于', '长江', '西路', '的', '中国石油大学教务处', '，', '王新哲', '、', '操', '应', '长', '、', '查明', '正在', '参观', '邃', '智', '科技有限公司', '！'], ['德玛西亚', '带领', '着', '他', '的', '子民', '在', '瓦罗兰大陆', '，', '与', '瑞', '兹', '展开', '了', '殊死搏斗']]\n"
     ]
    }
   ],
   "source": [
    "texts=[\"在山东青岛，位于长江西路的中国石油大学教务处，王新哲、操应长、查明正在参观邃智科技有限公司！\",\"德玛西亚带领着他的子民在瓦罗兰大陆，与瑞兹展开了殊死搏斗\"]\n",
    "predictor=p.tagger('./model')\n",
    "result=predictor.cut(texts)\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 词性标注"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[('在', 'P'),\n",
       "  ('山东青岛', 'NS'),\n",
       "  ('，', 'W'),\n",
       "  ('位于', 'V'),\n",
       "  ('长江', 'NS'),\n",
       "  ('西路', 'N'),\n",
       "  ('的', 'UDE1'),\n",
       "  ('中国石油大学教务处', 'NT'),\n",
       "  ('，', 'W'),\n",
       "  ('王新哲', 'NR'),\n",
       "  ('、', 'W'),\n",
       "  ('操', 'V'),\n",
       "  ('应', 'V'),\n",
       "  ('长', 'A'),\n",
       "  ('、', 'W'),\n",
       "  ('查明', 'V'),\n",
       "  ('正在', 'D'),\n",
       "  ('参观', 'V'),\n",
       "  ('邃', 'NT'),\n",
       "  ('智', 'NG'),\n",
       "  ('科技有限公司', 'NIS'),\n",
       "  ('！', 'W')],\n",
       " [('德玛西亚', 'NRF'),\n",
       "  ('带领', 'V'),\n",
       "  ('着', 'UZHE'),\n",
       "  ('他', 'RR'),\n",
       "  ('的', 'UDE1'),\n",
       "  ('子民', 'N'),\n",
       "  ('在', 'P'),\n",
       "  ('瓦罗兰大陆', 'NZ'),\n",
       "  ('，', 'W'),\n",
       "  ('与', 'CC'),\n",
       "  ('瑞', 'NRF'),\n",
       "  ('兹', 'RG'),\n",
       "  ('展开', 'V'),\n",
       "  ('了', 'ULE'),\n",
       "  ('殊死搏斗', 'NZ')]]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "predictor=p.tagger('./model')\n",
    "predictor.cut(texts,'posseg')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 命名实体识别\n",
    "1. 选取了具有代表性的人名进行识别，操老师姓氏比较少见，一般的工具很难识别，而査老师的名字一般以动词的形式出现，更加难以准确识别，但该项目可以很好的捕获到句子的语义信息，可以精确识别\n",
    "2. 组织名与地名也选取了非常不常见的进行测试，可以看到得到了很精确的结果"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(['王新哲', '操应长', '查明'], ['山东', '青岛', '长江西路'], ['中国石油大学教务处', '邃智科技有限公司']),\n",
       " (['德玛西亚', '瑞兹'], ['瓦罗兰大陆'], [])]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "predictor=p.tagger('./model')\n",
    "predictor.ner(texts)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
